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PROJECTED MULTILEVEL MONTE CARLO METHOD FOR 
PDE WITH RANDOM INPUT DATA 

MYOUNGNYOUN KIM AND IMBO SIM 


Abstract. The order of convergence of the Monte Carlo method is 1/2 which 
means that we need quadruple samples to decrease the error in half in the nu¬ 
merical simulation. Multilevel Monte Carlo methods reach the same order of 
error by spending less computational time than the Monte Carlo method. To 
reduce the computational complexity further, we introduce a projected mul¬ 
tilevel Monte Carlo method. Numerical experiments validate our theoretical 
results. 


1. Introduction 


The partial differential equation (PDE) with random input data jT9) [20l l26| 
takes a part of the stochastic partial differential equation (SPDE) which 
describes the problem with uncertain inputs [221 [381 [39] as follows 

( C(x] uj)u(x', to) = F{x\ lo), x £ D, well, 

(!) 


B(x\u))u{x\ijj) = Q(x;co), 


x £ dD, uj £ 9. 


Here, C, B , T , and Q are the stochastic partial differential operator, boundary 
operator, forcing term, and boundary value term in the spatial domain D with its 
boundary dD and the range 9 of input uncertainties, see 018] for the stochastic 
formulation of elliptic boundary value problems. To ensure the regularity of the 
solution of m, Babuska et al. 0H] assume that the diffusion coefficients in C 1 ( D ) 
are bounded, uniformly coercive in the convex domain D and their first derivatives 
are bounded almost surely for stochastic elliptic PDEs. We represent the uncertain¬ 
ties of the problem by a random variable w, which follows some known or unknown 
distribution of the probability. We can regard a SPDE as a problem depending on 
some parameters which take certain values in finite ranges. In this parametrization 
context, |T]) is a parametrized PDE (P 2 DE) [21 ISi [ULi [32] and one wants to simulate 
faster for a sequence of input data using the information on solutions at specially 
chosen parameters mmm- Both views have their own benefits to carry on and 
difficulties to cope with. We adopt the view of a SPDE, which means that we need 
to find out the mean of many solutions corresponding to samples. Note that the 
total cost of the computation is the number of samples times the cost of solving a 
PDE with fixed inputs after proper discretization, see [18: for a detailed description 
on the cost to compute one sample. 

A simple method to obtain the mean of solutions is the Monte Carlo (MC) 
method, which requires quadruple samples to reduce the current error in half. To 
reduce the cost while keeping the almost same order of convergence, variance re¬ 
duction techniques are suggested and studied, see [25] for many variants of them, 
like control variate, antithetic variate, and so on. Among them, the control variate 


l 


2 


MYOUNGNYOUN KIM AND IMBO SIM 


introduces a variate whose mean is known. When the correlation between the solu¬ 
tion and the variate is good enough, the optimal multiplier to reduce the variance is 
estimated by a few samples. In the two level MC method, the selected variate is the 
difference between the solutions at fine and coarse grids. The multiplier is just one. 
This can be extended to the case with more levels by observing that the telescoping 
sum of differences between two consecutive solutions makes the finest solution. In 
this sense, the method gets its name, the Multilevel Monte Carlo (MLMC) method. 
Note that we use the same sample to obtain the difference of approximations at 
two consecutive grids, see [28] for the idea and the error estimation of the MLMC 
method applied to the parametric integration. There are many results on MLMC 
methods applied to path simulations [23], elliptic PDEs with random coefficients 
m ns m stochastic elliptic multiscale PDEs [T|, parabolic SPDEs [IS]. An exten¬ 
sion of the MLMC method is derived by Giles et al. [24] using the antithetic variate 
method, which is successfully applied to eliminate Levy area simulation during the 
estimation of the payoff using the first order Milstein approximation. 

Stochastic collocation (SC) methods [5;, h] 001 (41; are similar to MC methods 
except that their sample points are determined in the parameter space f2, and an 
interpolant, for example, global Lagrange type polynomials as in 01- In SC meth¬ 
ods mi ns], they use the tensor product spaces for many random variables, which 
deteriorates the convergence rate and leads to the explosion of computation since 
the number of collocation points in a tensor grid grows exponentially. Smolyak |45j 
proposes sparse tensor product spaces to reduce the number of collocation points 
when the number of random variables is moderately large. The sparse tensor prod¬ 
uct grids are constructed by either Clenshaw-Curtis OH or Gaussian abscissas. 
Recently, Teckentrup et al. m introduce the Multilevel Stochastic Collocation 
(MLSC) approach for reducing the cost of the SC method. Inspired by multi¬ 
grid solvers for linear equations, the MLSC method uses a hierarchical sequence of 
spatial approximations combined with stochastic discretization to minimize compu¬ 
tational cost under the conditions of finite dimensional noise and bounded random 
fields with uniformly bounded and coercive coefficients, see m for more details on 
the conditions. 

In the MLMC application to SPDEs, there are two main tasks such that we take 
sample from the input random field, and form a spatial discretization of the PDE 
for a fixed sample parameter and solve it, following the algorithm shown in [18j . 
The former is the usual step for MC methods. The latter causes an extra burden to 
implement the MLMC algorithm, like storing elements and corresponding building 
blocks for the stiffness matrix at the coarse grid. In this paper, we propose a new 
MLMC estimator for PDE with random coefficients, based on the original idea of 
the MLMC method in [28], which approximates the solution at the fixed sample 
in two consecutive levels and takes the difference of them. We solve the problem 
for a given sample at the fine grid, take the projection of it to the corresponding 
coarse grid and regard the projected solution as the approximation at the coarse 
grid. Obviously, this procedure does not need to solve a problem at the coarse 
grid which leads to the reduction of the computational cost. This replacement 
suggests us a name for this method, that is, the Projected Multilevel Monte Carlo 
(PMLMC) method coming from the projection procedure of the method. In the 
PMLMC method, we project the solution at a fine grid into the solution space at 
a coarse grid. The projection takes less time than the solving a problems at the 
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coarse grid. We provide the theorems on the order of convergence for the PMLMC 
method and the optimal number of samples at corresponding levels. 

In this paper, we consider the following model problem, for d = 1,2, 

- V • (k{x;u>)\7u{x)) = 0, x € D = (0, l) d , w e ft. (2) 

In ID, we apply the Dirichlet boundary condition as the boundary condition 

zt(0) = 1, u( 1) = 0, 

and introduce additional Neumann boundary conditions in 2D, for x = (a"i,a,’ 2 ) 

r)ij OU 

“(0,0:2) = 1 , “(1,0:2) = 0 , — (0:1,0) = —(0:1,1) = 0 . 

Here, an uncertain hydraulic coefficient of the Darcy flow is based on a certain 
mean and covariance structure inferred from the data describing the situation in 
the subsurface structure, see ]TSj for details. There are several ways to represent 
the random variable k in © by using a Karhunen-Loeve (KL) [22, :TTi : MJ 22] a 
polynomial chaos 12 a ma emeu and a wavelet expansion m Since the coefficient 
in ground water flow can vary very largely, we can express them in a logarithmic 
scale 13 El EH [22]. In this case, we can apply any expansion for the logarithm 
of the coefficient, instead of the coefficient itself, which leads to an exponential 
dependence of the coefficient on the random variable u> and its second moment 
might be unbounded UM- For simplicity, we expand the logarithm of the random 
conductivity in ID through the KL expansion as Cliffe et al. [IS] . 

OO 

log k{x\ w) = E(log k(x ; •)) + E \Z9^£n(v)4> n {x). 

n =1 

Here, {£ n }^i is a set of zero mean random variables uncorrelated to each other. 
The eigenvalues {0 n } ( ^ =1 and normalized eigenvectors (</>n}/ < L 1 are generated from 
the covariance operator defined by 

C(x,y) = a 2 exp , x,y G D, (3) 

where a 2 is the variance, A is the correlation length and ||'|| is the usual p-norm. 
By the choice of p = 1, a 2 > 1 and A < diamD, the coefficient k is homogeneous 
and from Kolmogorov’s theorem in [44], it belongs to C°’ V (D) almost surely with 
77 < 1 / 2 . 

A theoretical analysis of elliptic PDEs with random coefficients such as 0 is 
done in m under the condition that coefficient fields in W l,co (D) are bounded 
uniformly from above and away from zero. Charrier et al. [15] analyze when the 
coefficient is not uniformly bounded and only in C 0,ri (D) with 77 < 1/2. We follow 
the covariance relation in [T5] and show the results on the estimation of E[u] of 
the solutions of © with respect to the realizations of the randomness. We use 
the same condition in Cliffe et al. [T5] to prove the order of convergence and the 
optimal number of samples for the PMLMC method. 

This paper has the following structure. In Section [2] we provide all the prelimi¬ 
naries on probability and Bochner integrals. We analyze the order of convergence 
and optimal number of samples for MC, MLMC, and PMLMC methods in Section 
[3] The analysis on the variance of the projection is also provided in Section [3] with 
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a corollary for a hierarchical grid structure. In Section [4j numerical results of 0 
are illustrated on the order of convergence and cost savings. 


2 . Preliminaries 


Let (f2, 7i 1 P) be a probability space where SI is the sample space and its elements 
w £ fl are outcomes, Ti C 2° is the grand history of cr-algebra and its elements 
H £ H are events, and P : H — > [0,1] is the probability measure. For a measurable 
space {E,£) with the cr-algebra £ C 2 E , a mapping X : S2 -A E satisfying A -1 A £ 
Ti. for any A £ £, is an 15-valued random variable. The image px of P under an 
E -valued random variable X, 

px(A)=F[X~ 1 {A)] =F[X£A], y Ae£, 

is a probability measure on (15, £), called the distribution of X. If (F,E) is mea¬ 
surable, 15-valued and .F-valued random variables X and Y are independent if 
their joint distribution px.y is the product measure px x py on (15 x F, £ Cg> J 7 ), 
where px and py are the marginal distributions of X and Y respectively. A sim¬ 
ple 15-valued random variable A' attains only a finite number of distinct values 
{ x n}n=i C E, and has the form X = X^=i x n 1 A n where the A„ £ H are disjoint 
and 1a„ is the indicator function of A n . see M for more details. 

Let V be a separable Banach space with the norm ||-|| v , its topological dual V' 
and the Borel a -algebra B(V) for the measurable space (V,B(V)). If there exists a 
sequence {A'„} of simple P-valued random variables such that X n -^—>X, that is, 


lim \\X n — X\\ v = 0 


= 1 , 


then X is said to be strongly measurable. From [44], a Id-valued random variable 
has a sequence { X n } of simple P-valued random variables such that, for arbitrary 
w £ fl, the sequence {||X(w) — X rl (w)|| l/ } is monotonically decreasing to 0. This 
means that {lirn^oo ||A - X n \\ v = 0} = fl or Pflinin^oo \\X n - X\\ v = 0] = 1, 
i.e., X n a s > A. Thus any Id-valued random variable is strongly measurable if Id is 
separable. 

For a simple Id-valued random variable A = J2n=i x nl A„> if P[A„] is finite 
whenever x n ^ 0, then A is integrable and its integral, called the Bochner integral 
E[A] of AT, is 


E[A]= [ XdF= / X(w)P(rfw)=Va) n P[4]. 

Jn Jn n=1 

As in [44] . the real valued random variable ||A(-)||y is measurable for any Id- 
valued random variable A. Then a V-valued random variable A is Bochner inte¬ 
grable if 

E[||*U= / \\X\\ v dF= [ \\X(to)\\ v F(dio)<™. 

Jn Jn 

Since a Id-valued random variable X is strongly measurable, there is a sequence 
{X n } of simple Id-valued random variables such that X n ^AX, then 

( X n (uj), if || X n (uj)\\y < |||A(o;)||y, 

Y n (u) = l 


0 


otherwise, 
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forms a sequence {Y n } of simple V -valued random variables such that ||5^j|| y < 
|||X|| y and Y n -—>X. Since ||X||y is integrable, by Dominated Convergence The¬ 
orem, 

lim E[p--y„|| v ] =E 

n—> oo 

holds, and {E[F„J} is a Cauchy sequence in V. Then the Bochner integral of X is 


lim II X -Y n 


= 0 , 


E[X]= / XdP= lim / Y n dP= lim E[Y„], 

Jn n - >0 ° J n n ^°° 

and the limit is the same for any sequence {X„} of simple V -valued random vari¬ 
ables satisfying linin-xx, E[||X — X n ||y] = 0. 

Let L P (Q,] V) be the Bochner space of Bochner integrable, H-valued random 
variables X such that the corresponding norm 

f (E[PTO', 1 <p< oo, 

ll-^"llLP(n ; v) — | 

[ esssup^gQ ||X(w)|| v , p = oo, 

is finite, where X is the equivalence class with respect to the equivalence relation 
X ~ Y if and only if X = Y almost surely. 

If V is a separable Hilbert space with the inner product (') 'V> then the Bochner 
integral of the inner product of two independent H-valued random variables X and 
Y equals the inner product of their Bochner integrals, 


E [( A ^V] = / (x,y) v Vx,Y(dx,dy)= (x,y) v fj, x (dx)ii Y (dy) 

JvxV Jvxv 


IVxV JVxV 

(X(wi), Y (w 2 ))y P(dwi) P(dw 2 ) 




(X (wi), Y ( uj 2 )) v P (du} 2 ) 


X(wi), [ y(w 2 )P(dw 2 ) 
Jn 


P(dwi) 

P(dwi) 


/ X(u>i) P(dwi), / y(w 2 )P(dw 2 ) 

In Jn / v 

= (E[X],E \Y]) V . 


(4) 


Here the classical Fubini theorem and Proposition 1.6 in Chapter 1 of HD are used 
as well as the property of the independence. Finally, the variance of a H-valued 
random variable is 


V[X] 


E 


||X-E[X]|| 2 y 


= E 


Pill 


2pCE[.Y])y + ||E[X]||y 


\X\ 

\x\ 


2||E[X]||y + E 


imu 


2 

V 


imu 


2 


which coincides with the usual definition of the variance. 
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3. Order of Convergence and Complexity 


When liniAT-^oo E[Xjv] = E[A] for some random variable X and its approxi¬ 
mation Xn, the order of convergence is p if ||E[A/v — -X - ] || < CN~ P , where C is 
independent of N and ||-|| is the proper norm under the given context of conver¬ 
gence, see [18] . 

The computational cost C(Xn) is the number of floating point operations to 
compute Xjsi which is the realization of Xjy. Implicitly, Abv should satisfy some 
criteria of convergence, for example, the error between Xn and Xjy is less than or 
equal to the given number e. This means that the trivial choice of Xjy should be 
excluded. 

Let V = Hq(D) be a separable Hilbert space with the inner product (•, -) v and 


its associated norm 


such that 


(u,v) v = j (Vu- Vu-f uv)dx, |M| V = yj(u,u) v . 


3.1. Monte Carlo Method. Let {wfc}/E C V be solution samples corresponding 
to N independent, identically distributed realizations of random input data, and 
Ej$ G (u) the mean of them by the Monte Carlo (MC) method defined as 


E- 


MC 

N 


1 N 

(“) = v£ 


Uk G V. 


k =1 


The MC estimator satisfies the unbiased property as follows 

1 

N 


N N 

e[^“°(«)] = a7E e ki = a?E e m=em, 


fc =1 


fc =1 


since (ufc} fc=1 are independently chosen following the identical distribution. And 
the variance of the MC estimator is 


= E 


^rw-E[«] 


N 


v[a“ c ( m )] =e[||l;^ c (u)-e[l;“ c ( u )]| 

The error e“ c (w) between E^ c (u) and E[u] G V, 

N „ 

e N C ( u ) = e n C (u) - E[u] = E Uk ~ E M = Jf E ~ E M) 

k =1 fc= 1 

has independent terms Uk — E[u]. Using Q, we know that the Bochner integral of 
the inner product between mutually independent terms Uk — E[u] and Uj — E[u] for 
k ^ j, becomes 

E[(ufc - E [u\,Uj - E[u]) v ] = (E[tt fe - EtuJJjEfuj - E[u]]) v = 0, 

since E[itfc — E[u]] = E[itfc]—E[u] = 0 from the unbiased property of the MC method. 
Then the mean square error is 


|e™ c ( M )|r = E[||2$ C ( U )-E[ U ]|| V = V [E% c {uj\ 

1 


N 2 


1 


E 


N 

2 ' 

i N 


E _ E M) 


ii 

31- 

M 

m 

II Uk - E[lt]||y 

k =1 


k= 1 



= —E 

N 


, — E[i 


w 


= >[“] s v E 


= (5) 
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where ||-|| = is the norm of L 2 (fl; V). Thus, the relative error of the MC 

method is less than or equal to 1/y/N, that is, the order of convergence of the MC 
method is 1/2. Precisely, let e be the desired error bound for the MC estimator, 
i.e., ||e^ c ( , u)|| < ||u||/\/]V = e, then N = ||m|| 2 £- -2 is the best choice to attain the 
desired error. That is, we must increase the number of samples fourfold to decrease 
the error by half. 


3.2. Single Level Monte Carlo Method. Let 7/ be the triangulation of D into 
simplices with a mesh size hi for l £ N, and nodes in 77—l belong to those in 7/ 
which ensures the hierarchical structure of triangulation. Let Vi be the space of 
piecewise linear functions on 7), i.e., 

Vi = {v £ V : v\ K £ Vx (7Q, V I< £ 71}, 

where Vi(K) is a linear polynomial space on a triangle I\ £ Ti . Let {wz,fc}fclli C Vi 
be Galerkin Finite Element approximations in Vi corresponding to the realizations 
of the random coefficient. Then the Single Level Monte Carlo Finite Element 
approximation 7/ M y' (u) in Vi is defined by 


N, 


Kn, («) = ^ E u hk € L 2 (n- Vi). 


k= 1 


From the equality relation of (|5|), the variance of Ef^ (u) is 

v[<c (u) ] =v[E^M] = i-vM. 

Let (u) be the error between and E[u], 

(«) = Kn, («) - E M = Kni («) - IEM («)] + E M - E M • 


Then we expand the mean square error 

I 2 


,MC 

'l,N t 


(u) 


as follows 


|e^»|| = E ||E^( m )_ E ^mc ( m) ] +EW _ EM 


= E 


|E, m C( m )_e[^ m C( m) ] 


-E 


PM-EMIK 


+ 2(E[(£®i t) -E[E^ (u) ] )E M -E[u]) v \) 

= V[E™C(u)] + ||E[«i] - EMU 2 = ±.V[ui] + PM - u]|| 2 . 

Here, the Bochner integrals of inner products between different deviations are zero 
due to the unbiased property of the MC method and the relation (H]) for two inde¬ 
pendent random variables. The computational cost of the estimator by the Single 
Level Monte Carlo method is 


= N t C h 

where Ci is the mean computational complexity at level l by the finite element 
method. The main difference to the MC method is the use of the finite space 
to approximate a solution for a realization of the randomness, which results the 
approximation error. 
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3.3. Multilevel Monte Carlo Method. Set uo = 0, and wi = ui — ui-i for 
l = 1, ... ,L, where L is the maximum level. Clearly, ul = XLi ( u i ~ u l-1 ) and 
further we have 

L L L 

E [ul ] = E E I U * ~ u *-i] = E E M = E M + E E N] ’ 

1=1 1=1 1=2 

From the above observation, the Multilevel Monte Carlo (MLMC) Finite Element 
approximation E^ h (u) is defined by 

ET{u) = Y = Ef?(ui) + Y En?W- 

1=1 1=2 

Since E [E^(wi)] = E [wi] by the unbiased property of the MC method, we have 

L L 

e[e™ l (u)] =Y e [< c M] = E E H = E[«l]. 

1=1 1=1 


The variance of the MLMC estimator is 


V[£f L (u)] = E ||£^ L (u) -E[E^ h (u) 


= E 


E - e m)| 

i=i 

^e[|K c (u,o-eM|| 

1=1 

2 f E « C M - EM, E%?(w k ) - EK]), 

YYEn?(™i)] = Y ^ V M = E - u »-i]» 


i=i 


i=i 


i=i 


N t 


since the deviations E^2(wi) — E[tuj] are mutually independent due to different 
samples. Let e^ L (u) be the error between E^ L (u) and E[it] 

e L L (' u ) = E™ L (u) - E[«] = E™ l (u) - E[£f L (u)] + E [u L \ - E[u]. 

The mean square error becomes 


ML 

e L 


(«)| ' 


= E 
= E 


E™ h (u) -E[E™ h (u)] + E [u L ] - E[u] 


£f L («)-E[i? t ML ( u )] 


•E 


||E[ui] -E[u]|| 


v 


+ 2E[(E™ l (u) - E[£^ l (u)],E[u l ] - E[u]) 

= v[^ l W] + I|eK]-emh 2 
L i 

= E w Y ^ ui ~ ui ~^+ i |E t Ui - M ]ii 2 - 


i=i 
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The computational cost C(E^ L (u)) of the MLMC estimator is 
L L -1 

C(E^{u)) = +C t -i) = N l C l + Y, (■ Ni+N l+1 )C t , C 0 = 0, 

i=i i=i 

where C/ contains the mean complexity at level l including differencing cost. Cliffe 
et al. [IS] discuss the cost saving in the decay tendency of the variances, compared 
to the MC method. 


3.4. Projected Multilevel Monte Carlo Method. Let Pi be the projection 
from Vi into Vz_i, i.e., 


{Piui,v) v = (ui,v) v , V d 6 i, (6) 

and Pi = 0. Set wi = (/ — P;)rtp then the Projected Multilevel Monte Carlo 
(PMLMC) Finite Element approximation P£(w) is 

e p l ( u ) = y < c (( 7 - = E + E < C M- 

1=1 1=1 1=2 

Note that we replace P^ c (wj_i) in the MLMC method by the projection PiE^) c (u{) 
of Ej$p(ui) in order to simplify the computational complexity. Using the unbiased 
property of the MC method, we have 

L L L 

e[p£( u )] = E e [< c M] = E E H = E M + E E N' 

1=1 1=1 1=2 


Since K[E^ c (wi)] = E [wi], the variance of the PMLMC estimator is 


J Ni 

?P 


Y[p£( M )] = E \\El(u)-E[E p L (u)]\\ v 


= E 


E ( E n?(wi) - EM) 


1 = 1 


= ^e[||(p“ c (^)-eh)| 


v 


V 


1=1 


2 ^E[(P“ C M - EW,F“ C ( ^) - E[w k ]) 


uJ 


, l^k 


= Y = y ^v[ UJ - Pmi 


1=1 


1=1 


1=1 


Ni 


since we use different samples at each level, in other words, the deviations are 
independent to each other. The error of the PMLMC estimator is defined by 

e p L (u)=E p L (u)-E[u}. 
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Then the mean square error is 


e£0) || 2 = E 

||p£(n)-E[p£( u )] +e[p£(«)] -EMllE 

= E 

)\E p l (u)-E[E p l (u)]\\ 2 v 

+ E 

)|E[P]»] - 


+ 2(E[(E p l (u)-E[E p l (u)],E[E p l (u)} -E[«]) v ]) 
= V[p£(u)] + ||E[p£(u)] -E[n]|| 2 

= E ~ -«] II 2 - 


2 ' 
V 


Using the regular triangulation condition briefly stated in Section 13.51 see [H ITB1 
S3 G>2 for various conditions, we can bound the first variance term as follows. 


Lemma 1. Let 71 be a triangulation of D to form an approximate space Vi, and 
{a,i}? =1 the set of vertices of a triangle K £ 77 —1 - For a fine grid solution ui £ Vi, 
the bound of the variance of u\ — Piui is 

3 

Y[ui - Pm) < C(hf_ 1 + l) E E Y[{ui - Piui)(mi(K)- ■)], 

KeTi -1 i= 1 

where C is a constant related to the regular triangulation, hi_\ is the maximum 
of diameters of triangles, {rnj(7f)} i=1 are mid points on edges of the triangle, and 
variances V[(zq — PiUi)(mi(K); •)] are usual variances for discrete values.. 


Proof. We can expand the variance of ui — Piui as follows 


Y[ui - Pm] 


\ui - Pm - E[ui - PlUi}\\\ 


Ml 


L 2 (D) 


- E 


|Vu|| 


L 2 (D ) 2 


by introducing an auxiliary deviation v = ui — Piui — E[iq — Piu{\. We obtain the 
bound in Section 13.51 using the regular triangulation condition. □ 


Now, we want to bound the error with respect to the approximation error. 
Theorem 2. The error e£(w) is bounded by 

L 

||e£(u)|| < coNI+E^-^II, (?) 

1=1 

where the constants ci are dependent on numbers of samples as 

( co = 7v 1 -^+2eEe i > 

I Cl = 1 + N~* +2N~*, 

| ci = 2 + 2N t ~i +2N i+1 -*, for l = 2,..., L — 1, 

{ c L =2 + 2N L ~Y 

Proof. The difference between P/P“ c (zi z ) and P]vj C ( u /-i) is 

= PiE^( Ul ) - P l E[u,] + PiE[ Ul ] - E[«,_!] + E^z-x] - E%?(u t - 1) 

= -Pie^(u{) + P l E[ui - ui-i] + e^ c (uz_i) 














PMLMC FOR PDE WITH RANDOM INPUT DATA 


11 


for l = 2,. ..,L since Pi = I on Vi-\. Then the norm || E^ L (u) — E p (u)\\ can be 
bounded by 


\E™ l (u)-E p l (u)\\ = i)) 

1=2 
L 

< E (ll e “ c (^)ll + ii e n -+ ll e “ c (^-i)ll) 
1=2 
L 

E (ll e “ C ( M 0|| + \\ui-ui-i\\ + ||e^ c (uj-i)||), 


< 


1=2 


and the error e p (u) can be bounded by 


\e P L (n)\\ < \\¥.[u]-Er(u)\\ + \\E^(u)~E p (u)\ 

L 1 

< !|m- u L ii + y,—^ - ui -if 


E 

1=2 


VNi 


Ml + ||z*Z—1II) + II Ul - Ul- 1| 


< coiiuii + E Q ii u_uj ii’ 


1=1 


in terms of ||it||, ||u — u;|| and ||u — u;_i|l from expansions of ||tt;|| = ||uz — u + u|| 
and ||u ; - Ui—i || = ||u; - u + u - U;_i||- □ 


When ||u|| is a little bit large, its effect can be negated by increasing the total 
number of samples as shown in Co in Theorem [2J since it depends only on the 
number of samples at each level. On the other hand, we can decrease the error only 
when the approximation error at each level has a good order of convergence from 
the constant dependence on ||u —uj||. Furthermore, if ui converges to u in mean 
square, then V[u>z] = ¥[(/ — Pi)ui] converges to zero as the level l increases, and 
fewer samples are needed on finer levels to estimate E[u>z]- Compared to the error 
bound of the MLMC method, that of the PMLMC method has more dependence 
on numbers of samples TV; for l > 1 to control the error ||u||. The approximate 
property at the level next to the finest grid should be better to ensure good rate of 
convergence. 

The computational cost C(E p (u)) of the PMLMC estimator E p (u) is 

L 

C(E p (u)) =J2 N iCi, 

i=i 


where Ci is the mean computational cost at level l including the projection cost. 
After rearranging (0 according to TV;, we obtain the following theorem. 
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Theorem 3. For given e > 0, the optimal Ni and the computational cost by the 
PMLMC method are 

N\ = ?7 2 £ _2 (||m - ui|| + ||u||)*Ci _ 3, 

< Ni=rj 2 £- 2 {2(\\u-ui\\ + \\u-ui- 1 \\ + \\u\\)}^Cri, 1 = 2, ( 8 ) 

„ c ( e l( u )) = rfe- 2 . 

Here, the auxiliary variable e = e — ||it — Ui|| — 2 ll u — u ;|| should be positive 
and another auxiliary variable g is 

r, = (||« - Ul || + H)^ + jr {2(|| u - UI || + ||« - + \\u\mhti. 

1=2 

Proof. We can rearrange with respect to Ni as follows 

L 

VM\ < ll«-Will + 2 ^2\\u-ui\\ +7V 1 _5 (||u|| + ||u-Ul||) 

1=2 

L 

+ 2 51 N ^(\\ u ~ Ul W + ll u_ Wi-ill + Hull) = g(Ni,.. .,N l ). 

1=2 

Set f{Ni, ..., N l ) = NiCi- The Lagrange’s multiplier method [3] for / under 
the constraint g = £, gives us the result. □ 

From the condition on e, the natural inequalities ||it — ui\\ < e must hold for all 
l = 1 ,,L. We can give a concrete version of Theorem [3] by specifying the finite 
element space as follows. 


Corailary 4. LetVi have only linear elements, i.e, ||u — iq|| = 0(h{), or ||w — tq|| < 
Chi, and hi = 2/q +1 for l = 1,..., L — 1, where C is independent of hi. Under the 
condition hi < C~ 1 e, we can make the following estimation 

(f+wfr+zd +mi 

1=2 V 

where the mean complexity at level l is Ci < C'hf 2d for another constant C' dif¬ 
ferent from C and the space dimension d = 1,2. Thus the complexity increases in 
proportion to the power 2 + 2d of the inverse of the desired error e. 


C(E£(u)) <4 1+d £- 2(1 + d )c 2 d c' 


3 2(l + ld) 

2 3 


Proof. Since ||w — iti|| < e should hold, the maximum mesh size hi at the coarsest 
grid would satisfy hi < C~ 1 e, i.e., h L < 2 1 ~ L C~ 1 e. Choose h L = 2~ L C~ 1 e, or 
hi = 2~ 1 C~ 1 £, then the resulting matrix to solve the problem is sparse. Inserting 
these values into 0 completes the proof. □ 


3.5. Completion of Lemma [T] using Variance from Linear Interpolation. 

Let K be a triangle in a triangulation 77—l to form a piecewise linear approximation 
space Vi-i. We make 71 by making 4 sub-triangles {Ki }*_ 1 from vertices {ai}i=i 
and mid points {uq } i=1 on edges of the triangle K as shown in Fig. [I] Here, mid 
points are defined by 


m k 


Qi + aj 


k ^ i, i ± j, j £ k- 


2 
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Any point x in the triangle K is expressed by 

3 3 

x = y ^Wj(x) aj, Ywj{x) = 1, 

i=1 i= 1 

where C K. are weights at x in K, called barycentric coordinates , deter¬ 

mined by proportional lengths as illustrated in Fig. |T] For u(-;w) £ Vi, its linear 




Figure 1. In Left, vertices {ai} i=1 and mid points {mi} i=1 on 
edges of a triangle K in 77— i and sub-triangles {A7jy =1 C 71 are 
shown in black and red dots. In Right, weights {wi}- 1 are drawn 
as proportional lengths for a point x with an auxiliary point x on 
the opposite edge to the vertex a\. 

projection Pu(-;oj) £ on a triangle K £ 71 -1 is 

3 

Pil\ k (x-uj) = Y w i( x ) u ( a L w), (9) 

i= 1 

by the definition ©. Since u is piecewise linear, we can express u on K as 

4 4 3 

u\ K (x-,u) = Y u \ kS x ' w } = EE Wi j( :c ) u ( ai 3; u ) 

j =1 i=l *'=1 

using {Ki}f =1 in Fig. [I] with ordered points {a^} and weights {t%-} 


ai 

m 3 

m 2 

mi 


Will 

Wl 2 

WIl 3 

WI14 

m 3 

0.2 

TOi 

m 2 

, Kj] = 

WI 2 1 

W 22 

w>23 

W> 2 4 

m 2 

TOi 

03 

m 3 


_ w; 3 i 

W 32 

WI 33 

WI 3 4 


Note that orientations of sub-triangles are determined by a^- and weights Wij have 
supports not on the whole K, but only on Kj, which are, for i, j = 1, 2, 3, 

Wij(x) = [ 2 Wi(x) - Sij] Wi4,{x) = [1 - 2wi(x)\ x*{x), 

where 5^ is the Kronecker delta and Xji x ) is the characteristic function on Kj. 
Using X;j: we express Pu\k as 

4 3 

Pu\ K (x\u) = y^y^w i (a;)xj(a;)u(a i ;a;). 

I=i *= i 
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The deviation v of the difference u — Pu and its mean E[u — Pu] has the form 


IK 


(x; to) = (u — Pu — E[u — Pu])\ K (x; to) = Y^ v \ k .{ x 'i <*>), 

j'=i 


where v\ k, are 


I Ki 


\ Ki 


= 2 (rjjW k + rjkWj), i ^ j, j ± k, k ^ i, 

3 

= ^2[Vi~ 2r liWi], 

2 = 1 


for i,j,k = 1,2,3 and deviations r/i of mid point values of u(-;u>) defined by 
r]i = u(nii;u>) — Pu(m*; u) — E[u(mj; •) — Pu(rn,; w)], i = 1,2,3, 
since u(di',u> ) = Pu(ai\u>) by the property of Pu from ©. We have 

„ 4 


IK 


' dx = ^ / v 2 dx 


.7=1 ^ 


where 


J v 2 dx = |i^il x i(j 7 | + r) k 2 + rjjrjk), i ^ j , j ^ k, k ^ i , 


dx = |iv 4 | x - I ^ ? ?i 2 + ^ rn 


I Ki 


6 


% 


, i=l i<j 

for i,j,k = 1,2,3. Here, |Aj| denotes the area of AT^ and has the same value, a 
quarter of |A'| by mid point rule. For iP(.D) norm, we calculate Vu on Aj as 

Vv| K . = 2(7]j\/w k + rjkVwj), j, j i , 

3 


Vu 


k 4 




2=1 


for i,j, k = 1,2,3. Their integrals are computed by 


/ |Vwj| 2 da; = 

iifi 


l^il 


4|iF| 


/. 
2 ‘j 


liv,; 


/ Via,- -S7wkdx =- 77 Ijlk cos 9i, j ^ k, k ^ Z, l ^ j, 

J Ki 41 A'| 

where k is the length of the edge opposite to the vertex a,; and 0 * is the angle at a* 
in A' for j, k, l = 1, 2, 3. After arranging the summation, we obtain the following 
„ _ 3 3 


/ if 




|Vu| 2 dx < 


2 — 1 

3 3 


2 — 1 
3 


> K 


2—1 2=1 

Here, C > 5/48 is a constant for the regular triangulation such that the diameter 
p of the incircle and the longest side h of K satisfy a relation h/p < C/& for 
all K G 77 — 1 5 see Zlamal’s minimal angle condition [52], Ciarlet’s inscribed ball 
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condition [16] , and the maximum angle condition by Babuska et al. [4] and Jamet 
m- Now, the expectation of the square of H l (D )-norm of the deviation is 


E 


Hli 


= E 


IMIl 2 (D) + \\'^ v \\ l 2 ( d ) 2 


< c(h* + i) J2 E% 2 ] 

K&Ti-i i= 1 

= c(h 2 + 1) E v t( u - Pu )( w --)]- 

KeTi-1 i= 1 


Note that the left-hand side is the variance of u — Pu and variances at the right- 
hand side are usual variances for discrete values. The proof of Lemma |T| is done by 
replacing h, u — Pu and Wj with hi- 1 , ui — Piui and rrii(K), respectively. 


4. Numerical Simulations 

We use the finite element method in the piecewise linear function space to ap¬ 
proximate the Darcy flow @ with a fixed coefficient. We examine the performance 
of MLMC and PMLMC methods in computing the mean values of pressure fields 
of the Darcy flow ([2]) in D = (0, l) d for d = 1,2. As [18], we use the covariance 
operator ([3]) to make the logarithm of the coefficient k in |2]) by the KL expansion 
for the case A = 0.1 and a 2 = 1. 

4.1. Results in ID. For the KL expansion, we use Aa; = 1/8192 to generate 
eigenvalues and eigenfunctions up to 1000 modes. We use the mesh size h = 1/4096 
for the fine grid and h = 1/2048 for the coarse grid. The mean of 200000 samples 
by the MC method is regarded as E[u] for the comparison of MLMC and PMLMC 
methods. 

We measure the actual CPU time in seconds on a 3.07 GHz Intel Core i7 processor 
with 12 GB of RAM using a Matlab [36; code. It takes about 2 days and 14 hours 
in CPU time for the calculation of 200000 sample solutions. We observe that the 
PMLMC method speeds up the correction step at the fine grid by at least 12% 
compared to the MLMC method as tabulated in Left of Table [T] and depicted in 
Fig. [2] The PMLMC method keeps the almost same order of errors as shown in 
Right of Table [I] We use a fixed number of samples at the fine grid for MLMC and 
PMLMC methods as described in the caption of Right of Table |T| and illustrate the 
results in Fig. [3] 

4.2. Results in 2D. We make eigenvalues in 2D from the tensor products of those 
in ID, see Cliffe et al. [18] for details. We re-order eigenvalues in magnitude and 
truncate them up to 1000 modes in the rectangular grid of size 1/128. 

We use the triangular mesh to compute approximations. The coarse grid is 
generated by DistMesh [42] with a mesh size h = 0.0442. The fine mesh is made 
from the coarse grid using the mid point rule, and has a mesh size h = 0.0221. The 
fine and coarse mesh grids form a hierarchical mesh system and make it possible to 
reduce the error by decreasing the unmatched node points between coarse and fine 
grids. Numbers of elements and nodes are 2934, 1262 at the coarse grid and 9576, 
4917 at the fine grid, which means the fine grid has more than triple elements and 
nodes compared to those in the coarse grid. 
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Table 1. In Left, CPU time (sec) versus IV 2 (number of samples 
at the fine grid) for MLMC and PMLMC methods in ID. In Right, 
If 1 (I?) error versus N (total number of samples) for MC, MLMC 
and PMLMC methods in ID. The number of samples at the fine 
grid is fixed as IV 2 = 50 for MLMC and PMLMC methods. 


CPU Time (sec) 

H L (D) error 

n 2 

MLMC 

PMLMC 

N 

MC 

MLMC 

PMLMC 

50 

57 

49 

100 

9.0261 x 10 - * 

1.2547 x 10 -1 

1.2548 x 10 i 

200 

227 

198 

250 

7.5006 x 10"^ 

7.6602 x 10^ 

7.6599 x 10" 2 

800 

904 

780 

850 

“331393^0^ 

3.2706 x 10 ;i 

3.2708 x lO - * 

3200 

3626 

3154 

3250 

2.1194 x 10 - * 

232330 x"Tcf r ^ 

2.2328 x 10 ;i 

12800 

14495 

12574 

12850 

8.5928 x 10 _a 

9.3939 x 10" 3 

9.3952 x 10 _ii 



N 2 (number of samples at the fine grid) 


Figure 2. CPU time (sec) versus N 2 (number of samples at the 
fine grid) for MLMC and PMLMC methods in ID corresponding 
to Left of Table [1] 


We use the same machine with a Matlab code in 2D. It takes 6 days and 15 
hours in CPU time for the calculation of 30000 sample solutions. The PMLMC 
method speeds up at the fine grid by at least 18% as tabulated in Left of Table 
[2] and depicted in Fig. [I] In Right of Table [2] we doubles numbers of samples at 
the fine grid while quadruples those at the coarse grid. We illustrate the results in 
Fig. [5] The main reason to increase N 2 is due to the slow convergence rate. We 
expect that we can fix N 2 when the mesh size is small enough as in the case of ID. 

4.3. Cost Savings. For comparison of CPU time, we tabulate CPU times for MC, 
MLMC and PMLMC methods in Table [3] We illustrate the results in Fig. [6] and 
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N (total number of samples) 


Figure 3. H l {D) error versus N (total number of samples) for 
MC, MLMC and PMLMC methods in ID corresponding to Right 
of Table [H The number of samples at the fine grid is fixed as 
N 2 = 50 for MLMC and PMLMC methods. 


Table 2. In Left , CPU time (sec) versus N 2 (number of sam¬ 
ples at the fine grid) for MLMC and PMLMC methods in 2D. In 
Right , H 1 (D) error versus N (total number of samples) for MC, 
MLMC and PMLMC methods in 2D. N 2 are 50,100, 200,400, 800 
for MLMC and PMLMC methods. 


CPU Time (sec) 

H 1 (D) error 

n 2 

MLMC 

PMLMC 

N 

MC 

MLMC 

PMLMC 

50 

1179 

957 

100 

6.7379 x 10'" 


9.1353 x 10 ;i 

200 

4728 

3838 

300 

5.5589 x 10 -2 

6.2854 x 10^ 

lEiesTirnr 33- 

800 

18911 

15351 

1000 

- 2T383lT[(r T3- 

3.7902 x 10 -2 

3.7713 x lO - * 

3200 

75644 

61405 

3600 

1.0850 x 10 - * 

3.0330 x 10" 14 

3.0255 x 10 ;i 

12800 

302577 

245615 

13600 

5.2386 x 10 _ii 

2.9398 x lO-' 

2.9226 x 10”^ 


Fig. [7] for ID and 2D, respectively. In ID, the computational cost savings by the 
PMLMC method are constant, since the projection occurs only 50 times. 

In the fine grid, the PMLMC method spends almost same time as the MC method 
does, while the MLMC method does 14 % in ID and 20 % in 2D more time than 
the MC method does, as tabulated in Left of Tableland Table [2] for ID and 2D, 
respectively. We can say that the PMLMC method saves the computational cost 
further than the MLMC method. This is due to the solution procedure for the 
MLMC method in the coarse grid. To the contrary, the PMLMC method uses the 
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Figure 4. CPU time (sec) versus N 2 (number of samples at the 
fine grid) for MLMC and PMLMC methods in 2D corresponding 
to Left of Table [2] 


projection, a cheaper procedure compared to the MLMC method in the view of 
cost savings. 

Table 3. CPU time (sec) versus N (total number of samples) for 
MC, MLMC and PMLMC methods in ID {Left) and 2D {Right). 

The number of samples at the fine grid is fixed as N 2 = 50 for 
MLMC and PMLMC methods. 


CPU Time (sec) in ID 

CPU Time (sec) in 2D 

N 

MC 

MLMC 

PMLMC 

N 

MC 

MLMC 

PMLMC 

100 

99 

68 

60 

100 

1918 

1409 

1188 

250 

246 

100 

93 

300 

5754 

3263 

2825 

850 

837 

227 

218 

1000 

19187 

8425 

7508 

3250 

3201 

734 

726 

3600 

68995 

24100 

22243 

12850 

12729 

3103 

3091 

13600 

262352 

77108 

73220 


5. Conclusion 

From the error estimations in Section [3] we confirm that the order of convergence 
of the MC method is 1/2 which means that we need quadruple samples to decrease 
the error in half in the numerical simulation. MLMC and PMLMC methods are 
faster than the MC method and show the same order of convergence 1/2. When 
the most computation occurs at the coarse grid, the computational cost due to the 
increase of samples does not increase, rather decreases proportional to the ratio of 
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N (total number of samples) 


Figure 5. H l {D) error versus N (total number of samples) for 
MC, MLMC and PMLMC methods in 2D corresponding to Right 
of Table [2] N 2 are 50,100,200,400,800 for MLMC and PMLMC 
methods. 


the number of samples at the coarse grid to that at the fine grid. In the PMLMC 
method, we use the projection from the fine grid to the coarse grid to replace 
the approximation at the coarse grid in the MLMC method, which results in the 
reduction of the computational cost further than the MLMC method. 

The PMLMC method upgrades values at mid points of edges in the coarse grid 
when we use a hierarchical grid structure under mid point refinement scheme. This 
means that we can use small number of samples during the correction procedure in 
the fine grid, since the main structure of the mean value is estimated at the coarse 
grid by illustration of convergence of error in Fig. [3] and Fig. [5] Non-conforming 
finite element methods can be used for the PMLMC method since the PMLMC 
method does not require the inclusion of two consecutive approximate spaces. 

The variance analysis and optimal number of samples with illustrations of them 
through numerical simulations in ID and 2D are shown for the completeness of the 
paper. In the near future, we would present the results on MLMC and PMLMC 
methods combined with conforming and non-conforming finite elements applied to 
the Helmholtz equation with a random coefficient and wave equation with a white 
noise. 
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